XAI-Analytics Example Notebook

In [1]:
import os
import xai
import logging as log 
import warnings
import matplotlib.pyplot as plt
import sys, os
from util.commons import *
from util.ui import *
from util.model import *
from util.split import *
from util.dataset import *
from IPython.display import display, HTML
/home/g3no/github/XAI-Analytics/venv/lib/python3.8/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.metrics.scorer module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.metrics. Anything that cannot be imported from sklearn.metrics is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/g3no/github/XAI-Analytics/venv/lib/python3.8/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.feature_selection.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.feature_selection. Anything that cannot be imported from sklearn.feature_selection is now part of the private API.
  warnings.warn(message, FutureWarning)

Load a dataset

For this example we are going to use 'Adult Census Dataset'. It consists of both categorical and numerical features. In the output of the cell below, we can see the first five elements (head) of the dataset.

In [2]:
dataset, msg = get_dataset('census')
08-Nov-20 13:21:58 - Dataset 'census (Adult census dataset)' loaded successfully. For further information about this dataset please visit: https://ethicalml.github.io/xai/index.html?highlight=load_census#xai.data.load_census
08-Nov-20 13:21:58 - 
   age          workclass   education  education-num       marital-status  \
0   39          State-gov   Bachelors             13        Never-married   
1   50   Self-emp-not-inc   Bachelors             13   Married-civ-spouse   
2   38            Private     HS-grad              9             Divorced   
3   53            Private        11th              7   Married-civ-spouse   
4   28            Private   Bachelors             13   Married-civ-spouse   

           occupation    relationship ethnicity   gender  capital-gain  \
0        Adm-clerical   Not-in-family     White     Male          2174   
1     Exec-managerial         Husband     White     Male             0   
2   Handlers-cleaners   Not-in-family     White     Male             0   
3   Handlers-cleaners         Husband     Black     Male             0   
4      Prof-specialty            Wife     Black   Female             0   

   capital-loss  hours-per-week    loan  
0             0              40   <=50K  
1             0              13   <=50K  
2             0              40   <=50K  
3             0              40   <=50K  
4             0              40   <=50K  

Visualize the dataset

There are a lot of data visualisation techniques that can be used to analyze a dataset. In this example we will use three functions offered by the XAI module.

  • The first one shows the imbalances of selected features. In the first plot below, for example, we can see that the majority of samples (people) are white male (gender='Male', ethnicity='White').
  • The second and third plots show correlations between the features. The second one plots the correlations as a matrix, whereas the third one as a hierarchical dendogram.
In [3]:
%matplotlib inline
plt.style.use('ggplot')
warnings.filterwarnings('ignore')

imbalanced_cols = ['gender', 'ethnicity']

xai.imbalance_plot(dataset.df, *imbalanced_cols)
xai.correlations(dataset.df, include_categorical=True, plot_type="matrix")
xai.correlations(dataset.df, include_categorical=True)
08-Nov-20 13:21:58 - No categorical_cols passed so inferred using np.object, np.int8 and np.bool: Index(['workclass', 'education', 'marital-status', 'occupation',
       'relationship', 'ethnicity', 'gender', 'loan'],
      dtype='object'). If you see an error these are not correct, please provide them as a string array as: categorical_cols=['col1', 'col2', ...]

Target

In the cell below the target variable is selected. In this example we will use the column loan as target variable, which shows whether a person earns more than 50k per year. The features are split into a different dataframe.

In [4]:
df_X, df_y, msg = split_feature_target(dataset.df, "loan")
df_y
08-Nov-20 13:22:01 - Target 'loan' selected successfully.
Out[4]:
0         <=50K
1         <=50K
2         <=50K
3         <=50K
4         <=50K
          ...  
32556     <=50K
32557      >50K
32558     <=50K
32559     <=50K
32560      >50K
Name: loan, Length: 32561, dtype: object

Training the models

In this step three models are going to be trained on this dataset. All of them will be trained on the raw dataset (without any preprocessing). In the output below we could see classification reports for the trained models. The second model achieves the highest accuracity of ~0.84.

  • Model 1: Logistic Regression
  • Model 2: Random Forest
  • Model 3: Decision Tree
In [5]:
# Create three empty models
initial_models, msg = fill_empty_models(df_X, df_y, 3)
models = []

# Train model 1
model1 = initial_models[0]
msg = fill_model(model1, Algorithm.LOGISTIC_REGRESSION, Split(SplitTypes.IMBALANCED, None))
models.append(model1)

# Train model 2
model2 = initial_models[1]
msg = fill_model(model2, Algorithm.RANDOM_FOREST, Split(SplitTypes.IMBALANCED, None))
models.append(model2)

# Train model 3
model3 = initial_models[2]
msg = fill_model(model3, Algorithm.DECISION_TREE, Split(SplitTypes.IMBALANCED, None))
models.append(model3)
08-Nov-20 13:22:02 - Model accuracy: 0.8028457365134609
08-Nov-20 13:22:04 - Classification report: 
              precision    recall  f1-score   support

       <=50K       0.94      0.79      0.86      7414
        >50K       0.56      0.84      0.67      2355

    accuracy                           0.80      9769
   macro avg       0.75      0.82      0.77      9769
weighted avg       0.85      0.80      0.81      9769

08-Nov-20 13:22:05 - Model Model 1 trained successfully!
08-Nov-20 13:22:20 - Model accuracy: 0.8426655747773569
08-Nov-20 13:22:22 - Classification report: 
              precision    recall  f1-score   support

       <=50K       0.89      0.91      0.90      7414
        >50K       0.69      0.64      0.66      2355

    accuracy                           0.84      9769
   macro avg       0.79      0.77      0.78      9769
weighted avg       0.84      0.84      0.84      9769

08-Nov-20 13:22:23 - Model Model 2 trained successfully!
08-Nov-20 13:22:26 - Model accuracy: 0.8143105742655339
08-Nov-20 13:22:27 - Classification report: 
              precision    recall  f1-score   support

       <=50K       0.89      0.86      0.88      7414
        >50K       0.61      0.66      0.63      2355

    accuracy                           0.81      9769
   macro avg       0.75      0.76      0.75      9769
weighted avg       0.82      0.81      0.82      9769

08-Nov-20 13:22:28 - Model Model 3 trained successfully!
In [6]:
model_1 = models[0]
model_2 = models[1]
model_3 = models[2]

Global model interpretations

In the following steps we will use global interpretation techniques that help us to answer questions like how does a model behave in general? What features drive predictions and what features are completely useless. This data may be very important in understanding the model better. Most of the techniques work by investigating the conditional interactions between the target variable and the features on the complete dataset.

Feature importance

The importance of a feature is the increase in the prediction error of the model after we permuted the feature’s values, which breaks the relationship between the feature and the true outcome. A feature is “important” if permuting it increases the model error. This is because in that case, the model relied heavily on this feature for making right prediction. On the other hand, a feature is “unimportant” if permuting it doesn’t affect the error by much or doesn’t change it at all.

ELI5

In the first case, we use ELI5, which does not permute the features but only visualizes the weight of each feature.

  • Model 1
In [7]:
plot = generate_feature_importance_plot(FeatureImportanceType.ELI5, model_1)
display(plot)
08-Nov-20 13:22:28 - Generating a feature importance plot using ELI5 for Model 1 ...

y= >50K top features

Weight? Feature
+0.893 relationship_ Wife
+0.711 marital-status_ Married-civ-spouse
+0.704 occupation_ Exec-managerial
+0.552 occupation_ Prof-specialty
… 15 more positive …
… 31 more negative …
-0.527 workclass_ ?
-0.530 occupation_ ?
-0.532 gender_ Male
-0.550 marital-status_ Divorced
-0.552 occupation_ Machine-op-inspct
-0.558 occupation_ Handlers-cleaners
-0.584 education_ 11th
-0.600 ethnicity_ Black
-0.681 workclass_ Self-emp-not-inc
-0.694 occupation_ Farming-fishing
-0.756 relationship_ Unmarried
-0.896 occupation_ Other-service
-1.171 relationship_ Own-child
-1.225 gender_ Female
-1.318 marital-status_ Never-married
-1.757 <BIAS>
  • Model 2
In [8]:
plot = generate_feature_importance_plot(FeatureImportanceType.ELI5, model_2)
display(plot)
08-Nov-20 13:22:28 - Generating a feature importance plot using ELI5 for Model 2 ...
Weight Feature
0.2241 ± 0.0918 age
0.1064 ± 0.0228 hours-per-week
0.0958 ± 0.1993 marital-status_ Married-civ-spouse
0.0726 ± 0.0358 capital-gain
0.0718 ± 0.0734 education-num
0.0557 ± 0.1490 relationship_ Husband
0.0435 ± 0.1148 marital-status_ Never-married
0.0248 ± 0.0126 capital-loss
0.0175 ± 0.0200 occupation_ Exec-managerial
0.0167 ± 0.0181 occupation_ Prof-specialty
0.0158 ± 0.0566 relationship_ Own-child
0.0140 ± 0.0348 relationship_ Not-in-family
0.0139 ± 0.0383 gender_ Female
0.0120 ± 0.0297 relationship_ Wife
0.0118 ± 0.0039 workclass_ Private
0.0102 ± 0.0174 education_ Bachelors
0.0097 ± 0.0161 occupation_ Other-service
0.0091 ± 0.0029 workclass_ Self-emp-not-inc
0.0090 ± 0.0111 education_ HS-grad
0.0084 ± 0.0233 gender_ Male
… 45 more …
  • Model 3
In [9]:
plot = generate_feature_importance_plot(FeatureImportanceType.ELI5, model_3)
display(plot)
08-Nov-20 13:22:28 - Generating a feature importance plot using ELI5 for Model 3 ...
Weight Feature
0.2937 marital-status_ Married-civ-spouse
0.1696 age
0.1102 education-num
0.0979 capital-gain
0.0876 hours-per-week
0.0281 capital-loss
0.0158 workclass_ Private
0.0096 occupation_ Prof-specialty
0.0090 ethnicity_ White
0.0090 workclass_ Self-emp-not-inc
0.0089 occupation_ Exec-managerial
0.0082 occupation_ Sales
0.0077 occupation_ Craft-repair
0.0069 occupation_ Other-service
0.0066 workclass_ Local-gov
0.0060 occupation_ Tech-support
0.0059 occupation_ Transport-moving
0.0058 gender_ Male
0.0058 workclass_ State-gov
0.0056 ethnicity_ Black
… 45 more …
In [10]:
print(generate_feature_importance_explanation(FeatureImportanceType.ELI5, models, 4))
08-Nov-20 13:22:31 - Generating feature importance explanation for ELI5 ...
Summary:
 The highest feature for Model 1 is relationship_ Wife with weight ~0.893. The 2nd highest feature for Model 1 is marital-status_ Married-civ-spouse with weight ~0.711. The 3rd highest feature for Model 1 is occupation_ Exec-managerial with weight ~0.704. The 4th highest feature for Model 1 is occupation_ Prof-specialty with weight ~0.552. 
 The highest feature for Model 2 is age with weight ~0.224. The 2nd highest feature for Model 2 is hours-per-week with weight ~0.106. The 3rd highest feature for Model 2 is marital-status_ Married-civ-spouse with weight ~0.096, same as 2nd for Model 1 but with different weight. The 4th highest feature for Model 2 is capital-gain with weight ~0.073. 
 The highest feature for Model 3 is marital-status_ Married-civ-spouse with weight ~0.294, similar to 2nd for Model 1 but with different weight. The 2nd highest feature for Model 3 is age with weight ~0.17, same as 1st for Model 2 but with different weight. The 3rd highest feature for Model 3 is education-num with weight ~0.11. The 4th highest feature for Model 3 is capital-gain with weight ~0.098, similar to 4th for Model 2 but with different weight. 

Skater

In this step we use the Skater module, which permutes the features to generate a feature importance plot.

  • Model 1
In [11]:
%matplotlib inline
plt.rcParams['figure.figsize'] = [14, 15]
plt.style.use('ggplot')
warnings.filterwarnings('ignore')

_ = generate_feature_importance_plot(FeatureImportanceType.SKATER, model_1)
08-Nov-20 13:22:34 - Generating a feature importance plot using SKATER for Model 1 ...
08-Nov-20 13:22:34 - Initializing Skater - generating new in-memory model. This operation may be time-consuming so please be patient.
2020-11-08 13:22:58,931 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 25 seconds
  • Model 2
In [12]:
_ = generate_feature_importance_plot(FeatureImportanceType.SKATER, model_2)
08-Nov-20 13:23:27 - Generating a feature importance plot using SKATER for Model 2 ...
08-Nov-20 13:23:27 - Initializing Skater - generating new in-memory model. This operation may be time-consuming so please be patient.
2020-11-08 13:23:51,057 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 34 seconds
  • Model 3
In [13]:
_ = generate_feature_importance_plot(FeatureImportanceType.SKATER, model_3)
08-Nov-20 13:24:27 - Generating a feature importance plot using SKATER for Model 3 ...
08-Nov-20 13:24:27 - Initializing Skater - generating new in-memory model. This operation may be time-consuming so please be patient.
2020-11-08 13:24:51,294 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 22 seconds
In [14]:
print('\n' + generate_feature_importance_explanation(FeatureImportanceType.SKATER, models, 4))
08-Nov-20 13:25:16 - Generating feature importance explanation for SKATER ...
2020-11-08 13:27:16,249 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 56 seconds
2020-11-08 13:30:12,460 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 64 seconds
2020-11-08 13:33:17,456 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progress_bar=False
[65/65] features ████████████████████ Time elapsed: 62 seconds
Summary:
 The highest feature for Model 1 is marital-status_ Never-married with weight ~0.102. The 2nd highest feature for Model 1 is gender_ Female with weight ~0.098. The 3rd highest feature for Model 1 is marital-status_ Married-civ-spouse with weight ~0.071. The 4th highest feature for Model 1 is education-num with weight ~0.063. 
 The highest feature for Model 2 is age with weight ~0.129. The 2nd highest feature for Model 2 is hours-per-week with weight ~0.089. The 3rd highest feature for Model 2 is education-num with weight ~0.081, similar to 4th for Model 1 but with different weight. The 4th highest feature for Model 2 is marital-status_ Married-civ-spouse with weight ~0.073, matching 3rd for Model 1 but with different weight. 
 The highest feature for Model 3 is age with weight ~0.156, similar to 1st for Model 2 but with different weight. The 2nd highest feature for Model 3 is marital-status_ Married-civ-spouse with weight ~0.143, alike 3rd for Model 1 but with different weight. The 3rd highest feature for Model 3 is education-num with weight ~0.141, similar to 4th for Model 1 but with different weight. The 4th highest feature for Model 3 is hours-per-week with weight ~0.11, same as 2nd for Model 2 but with different weight. 

Shap

In the cell below we use the SHAP (SHapley Additive exPlanations). It uses a combination of feature contributions and game theory to come up with SHAP values. Then, it computes the global feature importance by taking the average of the SHAP value magnitudes across the dataset.

  • Model 1
In [15]:
from shap import initjs
initjs()

%matplotlib inline
plt.style.use('ggplot')
warnings.filterwarnings('ignore')

generate_feature_importance_plot(FeatureImportanceType.SHAP, model_1)
08-Nov-20 13:34:19 - Generating a feature importance plot using SHAP for Model 1 ...
08-Nov-20 13:34:19 - Initializing Shap - calculating shap values. This operation is time-consuming so please be patient.

  • Model 2
In [16]:
generate_feature_importance_plot(FeatureImportanceType.SHAP, model_2)
08-Nov-20 13:56:12 - Generating a feature importance plot using SHAP for Model 2 ...
08-Nov-20 13:56:12 - Initializing Shap - calculating shap values. This operation is time-consuming so please be patient.

  • Model 3
In [17]:
generate_feature_importance_plot(FeatureImportanceType.SHAP, model_3)
08-Nov-20 14:37:30 - Generating a feature importance plot using SHAP for Model 3 ...
08-Nov-20 14:37:30 - Initializing Shap - calculating shap values. This operation is time-consuming so please be patient.

In [18]:
print(generate_feature_importance_explanation(FeatureImportanceType.SHAP, models, 4))
08-Nov-20 14:56:03 - Generating feature importance explanation for SHAP ...
Summary:
 The highest feature for Model 1 is capital-gain with weight ~0.149. The 2nd highest feature for Model 1 is gender_ Female with weight ~0.137. The 3rd highest feature for Model 1 is marital-status_ Never-married with weight ~0.135. The 4th highest feature for Model 1 is marital-status_ Married-civ-spouse with weight ~0.134. 
 The highest feature for Model 2 is marital-status_ Married-civ-spouse with weight ~0.148, alike 4th for Model 1 but with different weight. The 2nd highest feature for Model 2 is relationship_ Husband with weight ~0.12. The 3rd highest feature for Model 2 is age with weight ~0.076. The 4th highest feature for Model 2 is education-num with weight ~0.075. 
 The highest feature for Model 3 is marital-status_ Married-civ-spouse with weight ~0.398, same as 4th for Model 1 but with different weight. The 2nd highest feature for Model 3 is age with weight ~0.129, matching 3rd for Model 2 but with different weight. The 3rd highest feature for Model 3 is education-num with weight ~0.122, similar to 4th for Model 2 but with different weight. The 4th highest feature for Model 3 is hours-per-week with weight ~0.105. 

Partial Dependence Plots

The partial dependence plot (short PDP or PD plot) shows the marginal effect one or two features have on the predicted outcome of a machine learning model. A partial dependence plot can show whether the relationship between the target and a feature is linear, monotonic or more complex. For example, when applied to a linear regression model, partial dependence plots always show a linear relationship.

PDPBox

PDPBox is the first module that we use for ploting partial dependence. We will generate two plots, one for only one feature - age and one for two features - age and education-num.

  • Model 1
In [19]:
generate_pdp_plots(PDPType.PDPBox, model_1, "age", "None")
generate_pdp_plots(PDPType.PDPBox, model_1, "age", "education-num")
08-Nov-20 14:56:04 - Generating a PDP plot using PDPBox for Model 1 ...
08-Nov-20 14:56:18 - Generating a PDP plot using PDPBox for Model 1 ...
08-Nov-20 14:56:19 - findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
08-Nov-20 14:56:19 - findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
08-Nov-20 14:56:19 - findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
08-Nov-20 14:56:20 - findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
  • Model 2
In [20]:
generate_pdp_plots(PDPType.PDPBox, model_2, "age", "None")
generate_pdp_plots(PDPType.PDPBox, model_2, "age", "education-num")
08-Nov-20 14:56:29 - Generating a PDP plot using PDPBox for Model 2 ...
08-Nov-20 14:56:55 - Generating a PDP plot using PDPBox for Model 2 ...
  • Model 3
In [21]:
generate_pdp_plots(PDPType.PDPBox, model_3, "age", "None")
generate_pdp_plots(PDPType.PDPBox, model_3, "age", "education-num")
08-Nov-20 14:57:22 - Generating a PDP plot using PDPBox for Model 3 ...
08-Nov-20 14:57:40 - Generating a PDP plot using PDPBox for Model 3 ...

In the two examples below we will use Skater and SHAP for generating PDPs using features: age and education-num.

Skater

  • Model 1
In [22]:
generate_pdp_plots(PDPType.SKATER, model_1, "age", "education-num")
08-Nov-20 14:57:54 - Generating a PDP plot using SKATER for Model 1 ...
2020-11-08 14:58:06,187 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progressbar=False
[1136/1136] grid cells ████████████████████ Time elapsed: 719 seconds
  • Model 2
In [23]:
generate_pdp_plots(PDPType.SKATER, model_2, "age", "education-num")
08-Nov-20 15:10:08 - Generating a PDP plot using SKATER for Model 2 ...
2020-11-08 15:10:20,203 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progressbar=False
[1136/1136] grid cells ████████████████████ Time elapsed: 999 seconds
  • Model 3
In [24]:
generate_pdp_plots(PDPType.SKATER, model_3, "age", "education-num")
08-Nov-20 15:27:01 - Generating a PDP plot using SKATER for Model 3 ...
2020-11-08 15:27:13,612 - skater.core.explanations - WARNING - Progress bars slow down runs by 10-20%. For slightly 
faster runs, do progressbar=False
[1136/1136] grid cells ████████████████████ Time elapsed: 610 seconds

SHAP

  • Model 1
In [25]:
generate_pdp_plots(PDPType.SHAP, model_1, "age", "education-num")
08-Nov-20 15:37:25 - Generating a PDP plot using SHAP for Model 1 ...
  • Model 2
In [26]:
generate_pdp_plots(PDPType.SHAP, model_2, "age", "education-num")
08-Nov-20 15:37:26 - Generating a PDP plot using SHAP for Model 2 ...
  • Model 3
In [27]:
generate_pdp_plots(PDPType.SHAP, model_3, "age", "education-num")
08-Nov-20 15:37:27 - Generating a PDP plot using SHAP for Model 3 ...

Local model interpretations

Local interpretation focuses on specifics of each individual and provides explanations that can lead to a better understanding of the feature contribution in smaller groups of individuals that are often overlooked by the global interpretation techniques. We will use two moduels for interpreting single instances - SHAP and LIME.

SHAP

SHAP leverages the idea of Shapley values for model feature influence scoring. The technical definition of a Shapley value is the “average marginal contribution of a feature value over all possible coalitions.” In other words, Shapley values consider all possible predictions for an instance using all possible combinations of inputs. Because of this exhaustive approach, SHAP can guarantee properties like consistency and local accuracy. LIME, on the other hand, does not offer such guarantees.

LIME

LIME (Local Interpretable Model-agnostic Explanations) builds sparse linear models around each prediction to explain how the black box model works in that local vicinity. While treating the model as a black box, we perturb the instance we want to explain and learn a sparse linear model around it, as an explanation. LIME has the advantage over SHAP, that it is a lot faster.

In [28]:
examples = [] + get_test_examples(model_1, ExampleType.FALSELY_CLASSIFIED, 2)
examples = examples + get_test_examples(model_2, ExampleType.TRULY_CLASSIFIED, 2)
examples
Out[28]:
[3918, 4893, 9308, 5371]
Example 1
In [29]:
print(get_example_information(model_1, examples[0]))
print(generate_single_instance_comparison(models, examples[0]))
Example 3918's data: 
age                                53
workclass                     Private
education                   Bachelors
education-num                      13
marital-status     Married-civ-spouse
occupation            Exec-managerial
relationship                  Husband
ethnicity                       White
gender                           Male
capital-gain                        0
capital-loss                        0
hours-per-week                     40
Name: 14846, dtype: object
Actual result for example 3918:  <=50K

Example 3918 was truly classified by no model and falsely classified by Model 1, Model 2, Model 3.
 For further clarification see the explanations below.

  • Model 1
In [30]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_1, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_1, examples[0]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_1, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_1, examples[0]))
display(explanation)
08-Nov-20 15:37:28 - Initializing LIME - generating new explainer. This operation may be time-consuming so please be patient.
The prediction probability of Model 1's decision for this example is 0.87. LIME's explanation: 
The feature that largely affects Model 1's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.2627.
The feature with the second most substantial impact on Model 1's positive (1) prediction probability is occupation= Exec-managerial with value of 0.1493.
The third most effective feature for the positive (1) prediction probability of Model 1 is education-num > 12.00 with value of 0.105
The 4th feature that impact the positive (1) prediction probability of Model 1 is gender= Male with value of 0.1041
The feature that mainly affects Model 1's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.5733.


The prediction probability of Model 1's decision for this example is 0.87. SHAP's explanation: 
The feature that largely influences Model 1's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.1308.
The feature with the second largest affect on Model 1's positive (1) prediction probability is occupation_ Exec-managerial with value of 0.1295.
The third most influential feature for the positive (1) prediction probability of Model 1 is education-num with value of 0.0691
The feature that primarily impacts Model 1's negative (0) prediction probability is capital-gain with value of -0.0577.
The feature with the second most considerable affect on Model 1's negative (0) prediction probability is capital-loss with value of -0.018.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 2
In [31]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_2, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_2, examples[0]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_2, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_2, examples[0]))
display(explanation)
08-Nov-20 15:37:31 - Initializing LIME - generating new explainer. This operation may be time-consuming so please be patient.
The prediction probability of Model 2's decision for this example is 0.99. LIME's explanation: 
The feature that mostly affects Model 2's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.1166.
The feature with the second biggest influence on Model 2's positive (1) prediction probability is education-num > 12.00 with value of 0.0909.
The third most effective feature for the positive (1) prediction probability of Model 2 is relationship= Husband with value of 0.0522
The feature that mainly changes Model 2's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.4374.
The feature with the second most substantial change on Model 2's negative (0) prediction probability is hours-per-week <= 40.00 with value of -0.081.


The prediction probability of Model 2's decision for this example is 0.99. SHAP's explanation: 
The feature that mainly influences Model 2's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.3074.
The feature with the second most substantial impact on Model 2's positive (1) prediction probability is relationship_ Husband with value of 0.2543.
The third most impactful feature for the positive (1) prediction probability of Model 2 is occupation_ Exec-managerial with value of 0.1677


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 3
In [32]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_3, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_3, examples[0]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_3, examples[0])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_3, examples[0]))
display(explanation)
08-Nov-20 15:37:35 - Initializing LIME - generating new explainer. This operation may be time-consuming so please be patient.
The prediction probability of Model 3's decision for this example is 1.0. LIME's explanation: 
The feature that mostly impacts Model 3's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.155.
The feature with the second most considerable affect on Model 3's positive (1) prediction probability is education-num > 12.00 with value of 0.1266.
The feature that mostly influences Model 3's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.6419.
The feature with the second most considerable impact on Model 3's negative (0) prediction probability is capital-loss <= 0.00 with value of -0.1506.
The third most important feature for the negative (0) prediction probability of Model 3 is hours-per-week <= 40.00 with value of -0.0776


The prediction probability of Model 3's decision for this example is 1.0. SHAP's explanation: 
The feature that mostly affects Model 3's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.7939.
The feature with the second most substantial affect on Model 3's positive (1) prediction probability is relationship_ Husband with value of 0.2097.
The third most effective feature for the positive (1) prediction probability of Model 3 is hours-per-week with value of 0.0788
The feature that mainly influences Model 3's negative (0) prediction probability is education-num with value of -0.1058.
The feature with the second most substantial impact on Model 3's negative (0) prediction probability is age with value of -0.0099.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
Example 2
In [33]:
print(get_example_information(model_1, examples[1]))
print(generate_single_instance_comparison(models, examples[1]))
Example 4893's data: 
age                                47
workclass                     Private
education                   Bachelors
education-num                      13
marital-status     Married-civ-spouse
occupation             Prof-specialty
relationship                  Husband
ethnicity                       White
gender                           Male
capital-gain                        0
capital-loss                        0
hours-per-week                     50
Name: 29428, dtype: object
Actual result for example 4893:  <=50K

Example 4893 was truly classified by no model and falsely classified by Model 1, Model 2, Model 3.
 For further clarification see the explanations below.

  • Model 1
In [34]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_1, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_1, examples[1]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_1, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_1, examples[1]))
display(explanation)
The prediction probability of Model 1's decision for this example is 0.87. LIME's explanation: 
The feature that mostly influences Model 1's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.2715.
The feature with the second most substantial affect on Model 1's positive (1) prediction probability is gender= Male with value of 0.0973.
The third most impactful feature for the positive (1) prediction probability of Model 1 is education-num > 12.00 with value of 0.0905
The feature that primarily impacts Model 1's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.5908.
The feature with the second largest influence on Model 1's negative (0) prediction probability is capital-loss <= 0.00 with value of -0.2169.


The prediction probability of Model 1's decision for this example is 0.87. SHAP's explanation: 
The feature that mostly changes Model 1's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.1318.
The feature with the second most substantial impact on Model 1's positive (1) prediction probability is occupation_ Prof-specialty with value of 0.1019.
The third most influential feature for the positive (1) prediction probability of Model 1 is education-num with value of 0.0696
The feature that mainly affects Model 1's negative (0) prediction probability is capital-gain with value of -0.0582.
The feature with the second most substantial impact on Model 1's negative (0) prediction probability is capital-loss with value of -0.0181.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 2
In [35]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_2, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_2, examples[1]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_2, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_2, examples[1]))
display(explanation)
The prediction probability of Model 2's decision for this example is 0.62. LIME's explanation: 
The feature that mainly influences Model 2's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.1211.
The feature with the second largest impact on Model 2's positive (1) prediction probability is education-num > 12.00 with value of 0.0866.
The third most important feature for the positive (1) prediction probability of Model 2 is hours-per-week > 45.00 with value of 0.0794
The 4th feature that affect the positive (1) prediction probability of Model 2 is relationship= Husband with value of 0.0648
The feature that mainly affects Model 2's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.4479.


The prediction probability of Model 2's decision for this example is 0.62. SHAP's explanation: 
The feature that mainly impacts Model 2's positive (1) prediction probability is relationship_ Husband with value of 0.2.
The feature with the second most considerable influence on Model 2's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.1616.
The third most influential feature for the positive (1) prediction probability of Model 2 is occupation_ Prof-specialty with value of 0.0817
The feature that primarily changes Model 2's negative (0) prediction probability is age with value of -0.029.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 3
In [36]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_3, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_3, examples[1]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_3, examples[1])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_3, examples[1]))
display(explanation)
The prediction probability of Model 3's decision for this example is 0.76. LIME's explanation: 
The feature that mainly changes Model 3's positive (1) prediction probability is marital-status= Married-civ-spouse with value of 0.1739.
The feature with the second largest affect on Model 3's positive (1) prediction probability is hours-per-week > 45.00 with value of 0.0953.
The third most important feature for the positive (1) prediction probability of Model 3 is education-num > 12.00 with value of 0.0884
The 4th feature that impact the positive (1) prediction probability of Model 3 is 37.00 < age <= 48.00 with value of 0.0657
The feature that largely impacts Model 3's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.6779.


The prediction probability of Model 3's decision for this example is 0.76. SHAP's explanation: 
The feature that largely impacts Model 3's positive (1) prediction probability is marital-status_ Married-civ-spouse with value of 0.7219.
The feature with the second most considerable change on Model 3's positive (1) prediction probability is occupation_ Prof-specialty with value of 0.0849.
The third most important feature for the positive (1) prediction probability of Model 3 is age with value of 0.0442
The feature that mostly affects Model 3's negative (0) prediction probability is education-num with value of -0.0772.
The feature with the second largest impact on Model 3's negative (0) prediction probability is education_ Bachelors with value of -0.0441.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
Example 3
In [37]:
print(get_example_information(model_1, examples[2]))
print(generate_single_instance_comparison(models, examples[2]))
Example 9308's data: 
age                           33
workclass                Private
education                HS-grad
education-num                  9
marital-status          Divorced
occupation          Craft-repair
relationship       Not-in-family
ethnicity                  White
gender                      Male
capital-gain                   0
capital-loss                2258
hours-per-week                84
Name: 4091, dtype: object
Actual result for example 9308:  <=50K

Example 9308 was truly classified by Model 2, Model 3 and falsely classified by Model 1.
 For further clarification see the explanations below.

  • Model 1
In [38]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_1, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_1, examples[2]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_1, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_1, examples[2]))
display(explanation)
The prediction probability of Model 1's decision for this example is 0.7. LIME's explanation: 
The feature that mainly changes Model 1's positive (1) prediction probability is capital-loss > 0.00 with value of 0.1865.
The feature with the second most substantial influence on Model 1's positive (1) prediction probability is hours-per-week > 45.00 with value of 0.112.
The third most effective feature for the positive (1) prediction probability of Model 1 is gender= Male with value of 0.1034
The feature that largely influences Model 1's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.5878.
The feature with the second largest affect on Model 1's negative (0) prediction probability is education-num <= 9.00 with value of -0.0888.


The prediction probability of Model 1's decision for this example is 0.7. SHAP's explanation: 
The feature that mainly changes Model 1's positive (1) prediction probability is capital-loss with value of 0.2883.
The feature with the second biggest affect on Model 1's positive (1) prediction probability is hours-per-week with value of 0.2095.
The feature that mainly affects Model 1's negative (0) prediction probability is marital-status_ Divorced with value of -0.1133.
The feature with the second most substantial impact on Model 1's negative (0) prediction probability is capital-gain with value of -0.0666.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 2
In [39]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_2, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_2, examples[2]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_2, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_2, examples[2]))
display(explanation)
The prediction probability of Model 2's decision for this example is 0.7. LIME's explanation: 
The feature that mainly impacts Model 2's positive (1) prediction probability is hours-per-week > 45.00 with value of 0.0865.
The feature with the second biggest change on Model 2's positive (1) prediction probability is capital-loss > 0.00 with value of 0.0733.
The feature that mainly affects Model 2's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.4527.
The feature with the second largest influence on Model 2's negative (0) prediction probability is education-num <= 9.00 with value of -0.0812.
The third most impactful feature for the negative (0) prediction probability of Model 2 is marital-status= Divorced with value of -0.0417


The prediction probability of Model 2's decision for this example is 0.7. SHAP's explanation: 
The feature that largely impacts Model 2's positive (1) prediction probability is age with value of 0.0917.
The feature with the second biggest affect on Model 2's positive (1) prediction probability is marital-status_ Divorced with value of 0.0294.
The third most impactful feature for the positive (1) prediction probability of Model 2 is education-num with value of 0.0279
The feature that mainly affects Model 2's negative (0) prediction probability is capital-loss with value of -0.2455.
The feature with the second biggest influence on Model 2's negative (0) prediction probability is hours-per-week with value of -0.1274.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 3
In [40]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_3, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_3, examples[2]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_3, examples[2])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_3, examples[2]))
display(explanation)
The prediction probability of Model 3's decision for this example is 1.0. LIME's explanation: 
The feature that mostly affects Model 3's positive (1) prediction probability is hours-per-week > 45.00 with value of 0.0974.
The feature with the second biggest impact on Model 3's positive (1) prediction probability is capital-loss > 0.00 with value of 0.0626.
The feature that mainly impacts Model 3's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.6729.
The feature with the second largest change on Model 3's negative (0) prediction probability is education-num <= 9.00 with value of -0.0868.
The third most impactful feature for the negative (0) prediction probability of Model 3 is marital-status= Divorced with value of -0.0704


The prediction probability of Model 3's decision for this example is 1.0. SHAP's explanation: 
The feature that largely impacts Model 3's positive (1) prediction probability is age with value of 0.5333.
The feature with the second biggest change on Model 3's positive (1) prediction probability is education_ HS-grad with value of 0.0333.
The feature that primarily influences Model 3's negative (0) prediction probability is capital-loss with value of -0.4667.
The feature with the second most substantial influence on Model 3's negative (0) prediction probability is occupation_ Craft-repair with value of -0.05.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
Example 4
In [41]:
print(get_example_information(model_1, examples[1]))
print(generate_single_instance_comparison(models, examples[1]))
Example 4893's data: 
age                                47
workclass                     Private
education                   Bachelors
education-num                      13
marital-status     Married-civ-spouse
occupation             Prof-specialty
relationship                  Husband
ethnicity                       White
gender                           Male
capital-gain                        0
capital-loss                        0
hours-per-week                     50
Name: 29428, dtype: object
Actual result for example 4893:  <=50K

Example 4893 was truly classified by no model and falsely classified by Model 1, Model 2, Model 3.
 For further clarification see the explanations below.

  • Model 1
In [42]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_1, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_1, examples[3]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_1, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_1, examples[3]))
display(explanation)
The prediction probability of Model 1's decision for this example is 0.94. LIME's explanation: 
The feature that mainly impacts Model 1's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.5779.
The feature with the second biggest influence on Model 1's negative (0) prediction probability is marital-status= Never-married with value of -0.2405.
The third most impactful feature for the negative (0) prediction probability of Model 1 is capital-loss <= 0.00 with value of -0.2108
The 4th feature that affect the negative (0) prediction probability of Model 1 is gender= Female with value of -0.105
The 5th feature that impact the negative (0) prediction probability of Model 1 is hours-per-week <= 40.00 with value of -0.094


The prediction probability of Model 1's decision for this example is 0.94. SHAP's explanation: 
The feature that primarily affects Model 1's positive (1) prediction probability is marital-status_ Never-married with value of 0.2168.
The feature with the second largest affect on Model 1's positive (1) prediction probability is gender_ Female with value of 0.2018.
The third most impactful feature for the positive (1) prediction probability of Model 1 is capital-gain with value of 0.0513
The feature that mostly influences Model 1's negative (0) prediction probability is gender_ Male with value of -0.0827.
The feature with the second largest impact on Model 1's negative (0) prediction probability is ethnicity_ White with value of -0.069.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 2
In [43]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_2, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_2, examples[3]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_2, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_2, examples[3]))
display(explanation)
The prediction probability of Model 2's decision for this example is 0.99. LIME's explanation: 
The feature that largely impacts Model 2's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.4442.
The feature with the second most considerable impact on Model 2's negative (0) prediction probability is marital-status= Never-married with value of -0.0998.
The third most effective feature for the negative (0) prediction probability of Model 2 is age <= 28.00 with value of -0.088
The 4th feature that impact the negative (0) prediction probability of Model 2 is capital-loss <= 0.00 with value of -0.0804
The 5th feature that affect the negative (0) prediction probability of Model 2 is hours-per-week <= 40.00 with value of -0.0784


The prediction probability of Model 2's decision for this example is 0.99. SHAP's explanation: 
The feature that mostly changes Model 2's positive (1) prediction probability is age with value of 0.0161.
The feature with the second most considerable influence on Model 2's positive (1) prediction probability is hours-per-week with value of 0.0149.
The third most influential feature for the positive (1) prediction probability of Model 2 is gender_ Male with value of 0.0081
The feature that mainly impacts Model 2's negative (0) prediction probability is relationship_ Not-in-family with value of -0.0161.
The feature with the second biggest affect on Model 2's negative (0) prediction probability is education_ Some-college with value of -0.0039.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.
  • Model 3
In [44]:
explanation = explain_single_instance(LocalInterpreterType.LIME, model_3, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.LIME, model_3, examples[3]))
explanation.show_in_notebook(show_table=True, show_all=True)
explanation = explain_single_instance(LocalInterpreterType.SHAP, model_3, examples[3])
print(generate_single_instance_explanation(LocalInterpreterType.SHAP, model_3, examples[3]))
display(explanation)
The prediction probability of Model 3's decision for this example is 1.0. LIME's explanation: 
The feature that mainly changes Model 3's negative (0) prediction probability is capital-gain <= 0.00 with value of -0.6841.
The feature with the second most substantial influence on Model 3's negative (0) prediction probability is capital-loss <= 0.00 with value of -0.1333.
The third most important feature for the negative (0) prediction probability of Model 3 is marital-status= Never-married with value of -0.1215
The 4th feature that affect the negative (0) prediction probability of Model 3 is age <= 28.00 with value of -0.1092
The 5th feature that affect the negative (0) prediction probability of Model 3 is hours-per-week <= 40.00 with value of -0.0861


The prediction probability of Model 3's decision for this example is 1.0. SHAP's explanation: 
The feature that mostly affects Model 3's positive (1) prediction probability is hours-per-week with value of 0.2436.
The feature with the second most substantial influence on Model 3's positive (1) prediction probability is gender_ Male with value of 0.0041.
The third most influential feature for the positive (1) prediction probability of Model 3 is capital-gain with value of 0.003
The feature that mainly affects Model 3's negative (0) prediction probability is relationship_ Not-in-family with value of -0.0838.
The feature with the second most substantial impact on Model 3's negative (0) prediction probability is education_ Some-college with value of -0.0822.


Visualization omitted, Javascript library not loaded!
Have you run `initjs()` in this notebook? If this notebook was from another user you must also trust this notebook (File -> Trust notebook). If you are viewing this notebook on github the Javascript has been stripped for security. If you are using JupyterLab this error is because a JupyterLab extension has not yet been written.